Meta-interpreters for Rule-based Reasoning under Uncertainty

Authors

  • Shimon Schocken
  • Tim Finin
Abstract

One of the key challenges in designing expert systems is a credible representation of uncertainty and partial belief. During the past decade, a number of rule-based belief languages were proposed and implemented in applied systems. Due to their quasi-probabilistic nature, the external validity of these languages is an open question. This paper discusses the theory of belief revision in expert systems through a canonical belief calculus model which is invariant across different languages. A meta-interpreter for non-categorical reasoning is then presented. The purpose of this logic model is twofold: first, it provides a clear and concise conceptualization of belief representation and propagation in rule-based systems. Second, it serves as a working shell which can be instantiated with different belief calculi. This enables experiments to investigate the net impact of alternative belief languages on the external validity of a fixed expert system.

Center for Digital Economy Research, Stern School of Business, Working Paper IS-89-69

1 Uncertainty and Expert Systems

The ability to model uncertainty and belief revision is now considered a key challenge in designing credible expert systems. Regardless of whether the domain of expertise is medical diagnosis, venture capital, or oil exploration, human experts have to cope with uncertain data and inexact decision rules. Moreover, it is now an established fact that humans, laymen and experts alike, are very poor intuitive statisticians (Tversky and Kahneman, [34]). Specifically, human judgement under uncertainty is often irrational, to the extent that rationality is equated with the axioms of utility theory and subjective probability. There have been several attempts to represent uncertainty and belief revision within the rigid framework of logic, with Carnap's (1954) inductive logic [5] being the most seminal treatise on the subject.
Notwithstanding its significant philosophical contribution, inductive logic was not meant to serve as a practical modelling framework. And yet two decades later, Carnap's work on the theory of confirmation became the motivation for the certainty factors model, a popular belief calculus which was first implemented in 1976 in the MYCIN medical diagnosis system (Shortliffe, [30]). Since then, a wide variety of belief calculi and non-categorical inference methods were developed and implemented by researchers and practitioners. By and large, these methods can be classified into two categories: probabilistic and quasi-probabilistic.

Probabilistic methods include such models as Bayes networks (Pearl, [20]), influence diagrams (Howard and Matheson, [14]), and the Dempster-Shafer theory of evidence (Shafer, [27]). These models enjoy a solid theoretical foundation; they are either consistent with the axioms of probability theory, or they extend it in a clear and explicit manner, as in the case of the Dempster-Shafer model. However, it is now well understood that the marriage between logical inference and probabilistic inference is rather problematic. First, it was shown by Heckerman [13] and other authors that the modular structure of the rule-based architecture is generally inconsistent with the holistic nature of a joint distribution function. Second, probabilistic inference in a rule-based architecture was shown to be NP-hard (Cooper, [9]).

Quasi-probabilistic belief calculi are only partially consistent with the axioms of subjective probability. These calculi include MYCIN's certainty factors model (Shortliffe, [30]), the ad-hoc Bayesian model used in PROSPECTOR (Duda et al, [11]), and an assortment of similar calculi which are essentially isomorphic although they may differ in some details.
Following the great popularity of such rule-based shells as EMYCIN, M.1, and AL/X, quasi-probabilistic belief calculi became the de facto method of handling uncertainty in applied expert systems. And yet the algebraic structure of these pragmatic models is quite obscure, and their limitations and full potential are not well understood by practitioners and knowledge engineers.

This paper has three purposes. First, it gives a formal description of the structure of a belief calculus and how it may be integrated with the overall architecture of a rule-based system. Second, the paper presents a methodology designed to test the controversial validity of alternative belief calculi. The question of whether or not a belief calculus credibly represents (or improves) human judgement under uncertainty is of utmost importance, and it may be answered only through experimentation with human subjects. In order to run such experiments, one needs a canonical rule-based architecture which can easily accommodate different belief calculi. This leads to the third purpose of the paper, which is the development of a Prolog meta-interpreter, called SOLVE, for non-categorical reasoning. SOLVE is useful in that (a) it gives a clear computational definition of a belief calculus, and (b) it provides a platform for carrying out experiments with alternative belief calculi.

Although the presentation of SOLVE involves a certain degree of logic programming, the major concern of this paper is the theory of rule-based belief calculi, and the software engineering issues related to their integration with rule-based logic models. The implementation details of SOLVE and related predicates are presented in a separate appendix. This technical material is intended for readers who are interested in Prolog.
2 Rule-based Inference and Belief Languages

The mathematical and cognitive underpinnings of rule-based (production) systems are well known, and the reader is referred to Davis and King [22] and to Newell [18] for extensive discussions. Due to its proximity to first-order predicate calculus, the rational basis of categorical rule-based inference is normally unchallenged. This validity, however, does not extend naturally to applications involving uncertain facts and heuristic inference rules. Under such conditions, rule-based inference becomes an inexact, non-categorical classification procedure, designed to map an observed phenomenon on a set of one or more explaining hypotheses (Cohen, [7]). This inexact matching algorithm is carried out by applying modus ponens repeatedly to a set of rules of the form IF e THEN h WITH DEGREE OF BELIEF Bel, which, from now on, we denote e -> h # Bel [footnote 1]. The postfix Bel is a degree of belief, which, broadly speaking, reflects an expert's confidence in the logical entailment associated with the implication e -> h.

The problem, simply put, is this: given the prior belief in h and all the degrees of belief that parameterize rules and facts which ultimately imply h, how does one compute the posterior belief in h? In expert systems, this is typically accomplished by some sort of a belief calculus. As the rule-based inference engine processes rules which ultimately imply an hypothesis, a belief calculus is applied to update the posterior belief in this hypothesis. The process normally terminates when the belief in one or more hypotheses exceeds a certain pre-defined cutoff value. Therefore, a non-categorical belief calculus may be viewed as a "scoring" algorithm, a term coined by Cooper [8]. This algorithm accepts a set of inexact rules and a set of uncertain data, and goes on to "score" a set of competing hypotheses, i.e. compute their posterior beliefs.
There exist conditions under which the resulting scores are probabilities, but this is not always the case.

According to Shafer and Tversky [28], the building blocks of a belief language are syntax, calculus, and semantics. In the context of rule-based inference, syntax corresponds to the set of degrees of belief which parameterize uncertain facts, inexact rules, and prospective hypotheses. The degrees of belief associated with rules are elicited from domain experts as the knowledge-base is being constructed. The degrees of belief which parameterize observed or suspected pieces of evidence are obtained interactively through consultation. Posterior degrees of belief are computed through a set of operators collectively known as a belief calculus. We take the position that the semantics of the language consists of either a normative or a descriptive argument which justifies the validity of the syntax and calculus dimensions of the language.

[Footnote 1] Throughout the paper, e and h stand for a piece of evidence and an hypothesis, respectively.

2.1 A Canonical Belief Calculus

In order to propagate degrees of belief in a rule-based architecture consisting of uncertain facts and inexact rules, a belief calculus must be capable of handling three generic types of reasoning: Boolean conditioning, sequential propagation, and parallel combination. This section gives canonical definitions of each of these cases. Elsewhere in the paper we present language-dependent instantiations of these models and give their corresponding logic programming implementations.

Let h, e1, and e2 be an hypothesis and two pieces of evidence with current beliefs Bel(h), Bel(e1), and Bel(e2), respectively.
A non-categorical inference mechanism must be capable of computing the posterior belief in h, denoted $Bel(h \mid \cdot)$, in light of any recursive combination of the following generic evidential relationships:

Boolean conditioning:
    (e1 OR e2) -> h # Bel
    (e1 AND e2) -> h # Bel

Sequential propagation:
    e1 -> e2 # Bel1
    e2 -> h # Bel2

Parallel combination:
    e1 -> h # Bel1
    e2 -> h # Bel2

The exact specification of how to compute the posterior belief in h in any one of the above circumstances is precisely the definition of a belief calculus. Although the details of such specifications vary greatly across different belief languages, the basic structure of their underlying calculi is quite invariant. This observation leads to the notion of a canonical belief calculus, whose three components are described next.

Boolean Conditioning: Consider the rules (e1 or e2) -> h # Bel1 and (e1 and e2) -> h # Bel2. The degrees of belief Bel1 and Bel2 represent the strengths of the rules if both e1 and e2 are known to be certain. But what if one or both of these pieces of evidence is uncertain? In such cases, the belief calculus first computes the current belief associated with the premise of each rule, i.e. Bel(e1 and e2) and Bel(e1 or e2). Technically speaking, this computation is carried out through the template functions F_and and F_or, respectively:

$$Bel(e_1 \text{ and } e_2) = F_{and}(Bel(e_1), Bel(e_2))$$

$$Bel(e_1 \text{ or } e_2) = F_{or}(Bel(e_1), Bel(e_2))$$

Once the current belief in a rule's premise is established through Boolean conditioning, the posterior belief in the rule's conclusion can be computed through sequential propagation.

Sequential Propagation: Rule-based belief calculi make the implicit assumption that the "actual" degree of belief in a rule has to change when the belief in the rule's premise changes.
Specifically, let e -> h # Bel(h, e) be a rule specifying that "given e (with certainty), h is implied to a degree of belief Bel(h, e)," and let the current belief in e be Bel(e). When a rule-based inference engine operates on a knowledge-base, the premise e might be either (a) a terminal fact whose prior belief Bel(e) is specified by the user, or (b) an intermediate sub-hypothesis whose current belief $Bel(e \mid \cdot)$ was already computed by the system. Whichever category e falls in, the "actual" degree of belief in the rule, denoted Bel'(h, e), is computed through a variant of the following sequential propagation function, F_s:

$$Bel'(h, e) = F_s(Bel(e), Bel(h, e))$$

The function F_s is monotonically increasing in both variables Bel(e) and Bel(h, e). Therefore, F_s is sometimes referred to in the AI literature as an "attenuation function," designed to carry over the uncertainty associated with a rule's premise into the uncertainty associated with the rule itself.

Parallel Combination: Let h be an hypothesis with current degree of belief Bel(h) and let e1 -> h # Bel(h, e1) and e2 -> h # Bel(h, e2) be two rules that bear evidence on h independently. The combined, posterior belief in h in light of {e1, e2} is given by the following binary parallel combination function, F_p:

$$Bel(h, \{e_1, e_2\}) = F_p(Bel(h), Bel(h, e_1), Bel(h, e_2)) \qquad (4)$$

(It is implicitly assumed that Bel(h, e1) and Bel(h, e2) were already attenuated by F_s.) In order to free the inference process from order and clustering effects, the function F_p is normally required to be commutative and associative. If these requirements are satisfied, the binary F_p function can be extended recursively to an n-ary parallel combination function. The details of this extension are straightforward.

We now proceed to describe the CF calculus and the likelihood-ratio Bayesian calculus.
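The canonical calculus can be read as a small interface of three kinds of operators. The following Python sketch is illustrative only and is not the paper's Prolog implementation: the names BeliefCalculus, attenuate, and combine_parallel are hypothetical, the operators in TOY are assumptions chosen for simplicity, and the left fold is just one possible reading of the recursive n-ary extension of F_p.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable, Sequence

Belief = float  # assumption: a degree of belief is a single number


@dataclass(frozen=True)
class BeliefCalculus:
    """Hypothetical bundle of the canonical operators F_and, F_or, F_s, F_p."""
    f_and: Callable[[Belief, Belief], Belief]
    f_or: Callable[[Belief, Belief], Belief]
    f_s: Callable[[Belief, Belief], Belief]          # F_s(Bel(e), Bel(h, e))
    f_p: Callable[[Belief, Belief, Belief], Belief]  # F_p(Bel(h), Bel(h, e1), Bel(h, e2))


def attenuate(calc: BeliefCalculus, premise_belief: Belief, rule_belief: Belief) -> Belief:
    """Sequential propagation: the "actual" rule belief Bel'(h, e)."""
    return calc.f_s(premise_belief, rule_belief)


def combine_parallel(calc: BeliefCalculus, prior: Belief,
                     attenuated: Sequence[Belief]) -> Belief:
    """n-ary parallel combination as a left fold over the binary F_p.

    Assumes F_p is commutative and associative in its last two arguments,
    so the order of the attenuated rule beliefs does not matter.
    """
    if not attenuated:
        return prior  # no rule bears on h, so the belief is unchanged
    return reduce(lambda acc, nxt: calc.f_p(prior, acc, nxt), attenuated)


# Toy instantiation (an assumption for illustration, not one of the paper's calculi).
TOY = BeliefCalculus(
    f_and=min,
    f_or=max,
    f_s=lambda bel_e, bel_rule: bel_e * bel_rule,
    f_p=lambda prior, a, b: a + b - a * b,  # this toy F_p ignores the prior
)

if __name__ == "__main__":
    premise = TOY.f_or(0.5, 1.0)           # Boolean conditioning on (e1 or e2)
    actual = attenuate(TOY, premise, 0.5)  # sequential propagation
    print(actual)
    print(combine_parallel(TOY, 0.0, [0.5, 0.5, 0.5]))
```

Swapping a different set of four functions into BeliefCalculus changes the belief language without touching attenuate or combine_parallel, which is the sense in which the calculus acts as an external parameter of the inference procedure.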
The CF and Bayesian models are presented verbatim, and no attempt is made here to either defend their cognitive appeal or argue for or against their normative justification. Such analyses were carried out by Adams [1], Heckerman [13], Grosof [12], Schocken and Kleindorfer [25], and other authors.

2.2 The Certainty Factors Language

Following its initial implementation in MYCIN, the certainty-factors calculus has evolved into several forms, all of which may be easily incorporated into the architecture described in this paper. The calculus discussed here adheres to the original model, described in detail by Buchanan and Shortliffe [4]. In the additive CF syntax, a diagnostic rule of the form e -> h # CF(h|e) means that e increases the belief in h by the magnitude CF(h|e), which varies from -1 to 1. If e is irrelevant to h, CF(h|e) = 0. The extreme case of e being sufficiently convincing to confirm (disconfirm) h in certainty is modeled through CF(h|e) = 1 (CF(h|e) = -1).

There are basically two types of certainty factors. The CF's associated with rules (e.g. pacifist(X) -> democrat(X) # 0.9) are elicited from a domain expert when the rule-base is being constructed. The CF's associated with uncertain facts (e.g. the degree of belief attached to the fact pacifist(jim)) are supplied through consultation.

Boolean Conditioning: Consider the categorical disjunctive rule (e1 or e2) -> h which reads: either one of the two pieces of evidence e1 or e2 (known in certainty) can alone establish the hypothesis h. How does one extend this rule to situations in which either e1 or e2 is uncertain? This question is complicated by the observation that the uncertainty associated with these facts is not a probability, but, rather, an abstract measure of human belief. Kahneman and Miller [15] have argued that, under these circumstances, the most reasonable rule for Boolean combination is the one used in fuzzy logic (Zadeh, [35]).
This rule, which was implemented in MYCIN, sets the belief in a conjunction (disjunction) to the minimal (maximal) belief in its constituents:

$$CF(e_1 \text{ and } e_2) = \min(CF(e_1), CF(e_2))$$

$$CF(e_1 \text{ or } e_2) = \max(CF(e_1), CF(e_2))$$

Sequential propagation: The CF associated with the diagnostic rule e -> h # CF(h|e) is elicited from a domain expert under the assumption that the premise e is known with certainty. When the belief in e is less than certainty, the CF calculus attenuates the rule's degree of belief through the following sequential propagation function:

$$CF'(h \mid e) = \begin{cases} CF(h \mid e) \cdot CF(e) & \text{if } CF(e) > 0 \\ 0 & \text{otherwise} \end{cases}$$

Parallel combination: When two rules e1 -> h # CF(h|e1) and e2 -> h # CF(h|e2) bear evidence on h independently, their compound increased belief in h in light of {e1, e2} is computed through a binary combination function, defined as follows:

$$CF(h \mid e_1, e_2) = \begin{cases} CF(h \mid e_1) + CF(h \mid e_2) \cdot (1 - CF(h \mid e_1)) & \text{if both CF's are positive} \\ -\left( |CF(h \mid e_1)| + |CF(h \mid e_2)| \cdot (1 - |CF(h \mid e_1)|) \right) & \text{if both CF's are negative} \\ \ldots & \text{if the CF's have mixed signs} \end{cases}$$

2.3 The Bayesian Language

In Bayesian languages, a rule of the form h -> e # Bel reads: the hypothesis h causes the evidence e with a degree of belief Bel. There exist several different interpretations of Bel. Some Bayesian systems elicit and propagate conditional probabilities of the form $Bel = P(e \mid h)$. A more balanced Bayesian design would record not only $P(e \mid h)$ but also $P(e \mid \bar{h})$, leading to the two-place degree of belief $Bel = [P(e \mid h), P(e \mid \bar{h})]$. Finally, the likelihood-ratio Bayesian syntax consists of likelihood ratios of the form $Bel = P(e \mid h) / P(e \mid \bar{h})$. If the prior odds on h, $P(h)/P(\bar{h})$, is known, then Bayes rule dictates that the presence of e will change the posterior odds on h to

$$\frac{P(h)}{P(\bar{h})} \cdot \frac{P(e \mid h)}{P(e \mid \bar{h})}$$

Hence, the Bayesian syntax is multiplicative, unlike the CF syntax, which is additive.
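The contrast between the additive CF syntax and the multiplicative Bayesian syntax can be made concrete with a short Python sketch. It is a hedged illustration rather than the paper's Prolog code: all function names are hypothetical, and the mixed-sign branch of cf_parallel is an assumption, since that expression is not recoverable from the text here; the sketch substitutes the formula commonly attributed to later EMYCIN-style revisions, (x + y) / (1 - min(|x|, |y|)).

```python
def cf_and(cf1: float, cf2: float) -> float:
    """Belief in a conjunction: the minimal belief in its constituents."""
    return min(cf1, cf2)


def cf_or(cf1: float, cf2: float) -> float:
    """Belief in a disjunction: the maximal belief in its constituents."""
    return max(cf1, cf2)


def cf_attenuate(cf_rule: float, cf_premise: float) -> float:
    """Sequential propagation: scale the rule's CF by a positive premise CF."""
    return cf_rule * cf_premise if cf_premise > 0 else 0.0


def cf_parallel(cf1: float, cf2: float) -> float:
    """Parallel combination of two (already attenuated) CFs bearing on h."""
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 * (1 - cf1)
    if cf1 <= 0 and cf2 <= 0:
        return -(abs(cf1) + abs(cf2) * (1 - abs(cf1)))
    # ASSUMPTION: mixed-sign case, using an EMYCIN-style formula rather
    # than whatever expression the original text contained.
    denom = 1 - min(abs(cf1), abs(cf2))
    if denom == 0:
        raise ValueError("total confirmation and total disconfirmation conflict")
    return (cf1 + cf2) / denom


def posterior_odds(prior_odds: float, likelihood_ratio: float) -> float:
    """Odds-likelihood form of Bayes rule: the update is a multiplication."""
    return prior_odds * likelihood_ratio


if __name__ == "__main__":
    # Rule strength 0.9 is taken from the pacifist/democrat example;
    # the premise belief 0.5 is a hypothetical value.
    print(cf_attenuate(0.9, 0.5))
    print(cf_parallel(0.5, 0.5))     # two confirming rules accumulate additively
    print(posterior_odds(0.25, 4.0)) # prior odds 1:4, likelihood ratio 4
```

In the CF functions each new rule adds a fraction of the remaining distance to certainty, whereas posterior_odds rescales the current odds by a factor, which is the additive versus multiplicative distinction drawn above.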
Recall that our ultimate purpose is to develop a canonical meta-interpreter which can accommodate a wide variety of different belief calculi. With that in mind, we'll focus on a general Bayesian language in which the degree of belief Bel which parameterizes the rule h -> e # Bel is taken to be the 3-place list $[P(h), P(e \mid h), P(e \mid \bar{h})]$. If e is a terminal piece of evidence, the degree of belief in e is taken to be $Bel = P(e)$.

Boolean Conditioning: In quasi-probabilistic Bayesian systems, e.g. PROSPECTOR, the current belief in conjunctions and disjunctions involving uncertain propositions is computed through the same fuzzy logic conventions used in MYCIN:

$$P(e_1 \text{ and } e_2) = \min(P(e_1), P(e_2))$$

$$P(e_1 \text{ or } e_2) = \max(P(e_1), P(e_2))$$

Sequential propagation: The literature contains several heuristic procedures for Bayesian sequential belief update, e.g. Jeffrey's rule of conditioning (Shafer, [26]) and PROSPECTOR's interpolation function (Duda et al, [11]). For the sake of brevity, we choose to describe here a simple interpolation function, discussed by Wise [36]. Suppose the knowledge-base contains the rule h -> e # $[P(h), P(e \mid h), P(e \mid \bar{h})]$ and we find out through consultation that the piece of evidence e obtains with the current belief $P(e)$. Before we can calculate the impact of e on the posterior belief in h, we attenuate the rule's degree of belief as follows:

$$P'(e \mid h) = P(e \mid h) \cdot P(e) + P(\bar{e} \mid h) \cdot P(\bar{e}) \qquad (11)$$

$$P'(e \mid \bar{h}) = P(e \mid \bar{h}) \cdot P(e) + P(\bar{e} \mid \bar{h}) \cdot P(\bar{e}) \qquad (12)$$

This gives the "actual" rule's degree of belief, $[P(h), P'(e \mid h), P'(e \mid \bar{h})]$. Note that (11) is a weighted average of $P(e \mid h)$ and $P(\bar{e} \mid h)$, weighted by $P(e)$ and $P(\bar{e})$. (12) is similar.

Parallel combination: Let h -> e1 # Bel1, ..., h -> en # Beln be n causal rules with (already attenuated) degrees of belief $Bel_i = [P(h), P(e_i \mid h), P(e_i \mid \bar{h})]$. The posterior belief in h in light of the evidence {e1, ..., en} is computed through the following version of (the commutative and associative) Bayes rule:

$$\text{products-odds} = \frac{P(e_1 \mid h)}{P(e_1 \mid \bar{h})} \cdots \frac{P(e_n \mid h)}{P(e_n \mid \bar{h})}$$

$$\text{odds} = \text{products-odds} \cdot \frac{P(h)}{P(\bar{h})}$$

The posterior belief in h, $P(h \mid \cdot)$, may be derived through the simple transformation:

$$P(h \mid \cdot) = \frac{\text{odds}}{1 + \text{odds}}$$

3 On the Validity of Belief Languages

As the previous section illustrates, the CF and the Bayesian languages offer two different representations of uncertainty and partial belief. At the same time, one would hope that the behavior of a CF-based expert system would be compatible with that of a Bayesian expert system, all other things held equal (including the expert and the knowledge-base). This hypothesis can be tested only through experimentation. The design of such experiments and the computational tools which are necessary to support them are discussed in this section.

Consider the familiar problem of rating prospective dates and managing a little black book. Suppose a person, denoted hereafter dater, wishes to determine whether or not another person is a good match for a blind date, based on a single telephone conversation. For the sake of simplicity, let's assume that the dater's rationale is represented through the following CF-oriented knowledge-base:

    good_looking(X) or smart(X) -> date(X) # 0.8.

This knowledge-base has the following interpretation: [1] is a wishful (and inexact) conjecture that blind-daters typically make and then learn that they should have known better. [2] is an inexact rule of thumb which models the dater's social preferences. [3] is a certain fact about Leslie: s/he sounds good over the telephone. Fact [4] is an inexact estimate of Leslie's IQ. We see that, not unlike other domains of expertise, the dater's "knowledge" and perception of reality are heuristic and subjective, respectively. In the rule-based architecture of [1-4], this non-determinism is represented by the degrees of belief following the # symbol.
Note, however, that barring these numbers, [1-4] may be readily translated to a standard logic model. Let's assume that this model is implemented in Prolog, and consider a query that asks whether date(leslie) holds. Prolog's response to this query will be the laconic and rather unproductive result "Yes." Under the given semantics, this means: "go ahead and date Leslie." We think that most daters would reject this black and white dichotomy in favor of a finer and more informative matcher. In particular, let's assume that (a) the # degrees of belief in [1-4] were reinstated, and (b) a certainty-factors oriented meta-interpreter called SOLVE were available. Under these conditions, the original query may be recast as the following meta-query:

    solve(date(leslie), Bel)?

To which Prolog will answer:




Publication date: 1989